Progress in Camera-Based Document Image Analysis
Abstract
The increasing availability of high-performance, low-priced, portable digital imaging devices has created a tremendous opportunity to supplement traditional scanning for document image acquisition. Digital cameras, whether attached to cellular phones or PDAs or used as standalone still or video devices, are highly mobile and easy to use; they can capture images of any kind of document, including very thick books, historical pages too fragile to touch, and text in scenes; and they are much more versatile than desktop scanners. If robust solutions for analyzing documents captured with such devices become available, there is clearly demand for them in many domains. Traditional scanner-based document analysis techniques provide a good reference and starting point, but they cannot be used directly on camera-captured images. Camera-captured images can suffer from low resolution, blur, and perspective distortion, as well as complex layout and interaction between content and background. In this paper we present a survey of application domains, technical challenges, and solutions for recognizing documents captured by digital cameras. We begin by describing typical imaging devices and the imaging process. We then discuss document analysis from a single camera-captured image as well as from multiple frames, and highlight some sample applications under development and feasible ideas for future development.
Similar Resources
Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)
Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions, both of which degrade the performance of OCR systems. In this paper, we present a novel method that compensates for undesirable geometric distortions in order to improve OCR results. Our methodology is based on finding text lines using a dynamic local connectivity map and then applying a l...
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system that segments and classifies the contents of Arabic document images. The system includes preprocessing, document segmentation, feature extraction, and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing the image, and detecting and correcting image skew. In document segmentation, an algorith...
Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback
Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach to document image retrieval based on keyword spotting is proposed. The proposed method presents a framework that uses relevance feedback. Relevance feedback, an interactive and efficient method, is used in this paper to imp...
Current Status and Future Prospects of Camera-Based Character Recognition and Document Image Analysis
The pervasive use of handy, higher-resolution digital cameras is now defining new roles for character recognition and document image analysis as a means of analyzing camera-captured images. In this report, we survey the state of the art in research and technologies for camera-based character recognition and document image analysis. We also describe the current position and future prospects of characte...
Unwarping Images of Curved Documents Using Global Shape Optimization
The unwarping of curved document images is a crucial problem for camera-based document analysis, since most current OCR techniques cannot handle distortion due to perspective and warping. In previous work we have shown how to recover the page shape from a single image using an iterative procedure without camera calibration, and using the shape information to restore a frontal view of a flat ...